Chroma

Summary

This document covers the information to gather from Chroma in order to configure a Qarbine data service. The data service will use the Qarbine Chroma driver. You can define multiple data services that access the same Chroma instance, each with different credentials. Once a data service is defined, you can manage which Qarbine principals have access to it and its associated Chroma data. A Qarbine administrator can see all data services.

Overview

Chroma is an open source embedding database for multi-modal AI applications. Qarbine interfaces with both a local Chroma instance and the Chroma Cloud offering. More information can be found at https://www.trychroma.com/.

Chroma Configuration

Cloud Parameters

For Qarbine to access your Chroma Cloud data it needs the following:

  • tenant UUID,
  • endpoint,
  • API key, and
  • database.

For Chroma-based embedding, additional information is needed depending on the embedding service, as described in the Embedding Service section below.

Navigate to https://trychroma.com and log on.

Click on

  

The default page is shown.

  

Copy the tenant UUID into a temporary location.

  

Activate the API keys tab.

  

If you do not have an API key, then click   and fill in the form. An example is shown below.

  

Click    to complete the process.

  

Copy the API key into a temporary location.

Embedding Service

The Qarbine server options can include your AI embedding service information. The recognized case-sensitive option names are:

  • openAiApiKey *,
  • googleApiKey,
  • cohereApiKey,
  • huggingfaceUrl and
  • jinaApiKey *.

The embedding service API key is used to define Chroma’s embedding function, which takes text as input and performs tokenization and embedding. For the services marked with an asterisk above, also specify an embedding model using the “embeddingModel” option. For more information on embeddings see https://docs.trychroma.com/guides/embeddings.
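The rules above can be checked before entering the options. The following is a minimal Python sketch, not part of Qarbine; the function name check_embedding_options and the idea of holding the options in a dictionary are assumptions made for illustration. Only the option names and the asterisk rule come from the list above.

```python
# Option names taken from the list above; the names are case sensitive.
RECOGNIZED_KEYS = {
    "openAiApiKey", "googleApiKey", "cohereApiKey",
    "huggingfaceUrl", "jinaApiKey",
}
# The services marked with an asterisk also need "embeddingModel".
NEEDS_MODEL = {"openAiApiKey", "jinaApiKey"}


def check_embedding_options(options):
    """Return a list of problems found in a dict of embedding options.

    Hypothetical helper for illustration; Qarbine itself does not ship it.
    """
    problems = []
    present = [key for key in options if key in RECOGNIZED_KEYS]
    if not present:
        problems.append("no recognized embedding service option found")
    # Flag names that differ from a recognized name only by letter case.
    by_lowercase = {key.lower(): key for key in RECOGNIZED_KEYS}
    for key in options:
        if key not in RECOGNIZED_KEYS and key.lower() in by_lowercase:
            problems.append(
                f"'{key}' should be spelled '{by_lowercase[key.lower()]}'"
            )
    for key in present:
        if key in NEEDS_MODEL and not options.get("embeddingModel"):
            problems.append(f"'{key}' also requires the 'embeddingModel' option")
    return problems
```

An empty list means the options satisfy the rules stated above. The check says nothing about whether the key or model name is valid for the embedding service.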

Gather this information for the next step below.

Qarbine Configuration

Compute Node Preparation

Determine which compute node service endpoint you want to run this data access from. That URL will go into the Data Service’s Compute URL field. Its form is “https://domain:port/dispatch”. A sample is shown below.
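A quick way to catch typing mistakes in the Compute URL is to check it against the form given above. This is a minimal Python sketch using only the standard library; the function name is_compute_url and the sample host name are hypothetical, and the sketch assumes that only https URLs with an explicit port are acceptable, which follows the stated form but may be stricter than what Qarbine accepts.

```python
from urllib.parse import urlparse


def is_compute_url(url):
    """Return True if url looks like https://domain:port/dispatch.

    Hypothetical helper for illustration only.
    """
    try:
        parts = urlparse(url)
        port = parts.port  # raises ValueError when the port is not numeric
    except ValueError:
        return False
    return (
        parts.scheme == "https"
        and bool(parts.hostname)
        and port is not None
        and parts.path == "/dispatch"
        and not parts.query
        and not parts.fragment
    )
```

For example, is_compute_url("https://qarbine.example.com:8443/dispatch") returns True, while a URL without a port or with a different path returns False.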

  

The port number corresponds to a named service endpoint configured on the given target host. For example, the primary compute node usually has a ‘main’ service. That service’s configuration is defined in the ˜./qarbine.service/config/service.main.json file. Inside that file, the following driver entry is required:

"drivers" :[
. . .
"./driver/chromaDriver.js"
]

For Qarbine compute nodes other than the primary (main) one, the relevant configuration file is named service.NAME.json. Make sure the file remains well-formed JSON, otherwise a startup error is likely to occur. If you added the driver entry, restart the service using the general command line syntax

pm2 restart <service>

For example,

pm2 restart main

or simply

pm2 restart all
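Because malformed JSON is likely to cause a startup error, it can help to check the service configuration file before restarting. The following is a minimal Python sketch using the standard json module. The function name check_service_config is hypothetical, and the sketch assumes that "drivers" is a top-level key of the file, which the fragment above does not confirm.

```python
import json

# Driver entry required by the section above.
DRIVER_ENTRY = "./driver/chromaDriver.js"


def check_service_config(text):
    """Return "ok" or a short description of the problem.

    text is the full content of a service.NAME.json file.
    Assumes "drivers" is a top-level array (an assumption, see above).
    """
    try:
        config = json.loads(text)
    except json.JSONDecodeError as err:
        return f"invalid JSON at line {err.lineno}, column {err.colno}: {err.msg}"
    drivers = config.get("drivers") if isinstance(config, dict) else None
    if not isinstance(drivers, list):
        return 'no "drivers" array found'
    if DRIVER_ENTRY not in drivers:
        return f'"{DRIVER_ENTRY}" is missing from "drivers"'
    return "ok"

# Usage: read the file yourself and pass its content, for example
#   with open(path_to_service_json) as f:
#       print(check_service_config(f.read()))
```

A result of "ok" only means the file parses and lists the Chroma driver; it does not verify the rest of the service configuration.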

Data Service Definition

Open the Administration Tool.

Navigate to the Data Services tab.

  

A data service defines on which compute node a query will run by default, along with the means to reach the target data. The latter includes which native driver to use along with settings corresponding to that driver. Multiple Data Sources can reference a single Data Service. The details of any one Data Service are thus maintained in one spot rather than being spread across every Data Source, which would be a maintenance and support nightmare.

Click

  

On the right hand side enter a name and optionally a description.

  

Set the Compute URL field based on the identified compute node above. Its form is “https://domain:port/dispatch”. A sample is shown below.

  

Also choose the “Chroma” driver.

  

For Chroma Cloud, the server template is based on the Chroma Cloud API endpoint. An example is shown below.

  

You can reference environment variables using the syntax %NAME%. String values should be double quoted and the key/value pairs separated by commas, as shown in the sample below.
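To illustrate what a %NAME% reference means, the following minimal Python sketch replaces each %NAME% in a string with the value of the matching environment variable. It is not Qarbine's implementation; the function name expand_percent_vars, the option name apiKey, and the variable name CHROMA_API_KEY in the usage note are hypothetical. The sketch raises an error for an unset variable, which is a design choice made here and not a statement about how Qarbine behaves.

```python
import os
import re

# Matches %NAME% where NAME looks like an environment variable name.
_VAR = re.compile(r"%([A-Za-z_][A-Za-z0-9_]*)%")


def expand_percent_vars(text, env=None):
    """Replace every %NAME% in text with env[NAME].

    env defaults to the process environment. Illustration only.
    """
    env = os.environ if env is None else env

    def lookup(match):
        name = match.group(1)
        if name not in env:
            raise KeyError(f"environment variable {name} is not set")
        return env[name]

    return _VAR.sub(lookup, text)
```

For example, with CHROMA_API_KEY set to abc, expand_percent_vars('apiKey = "%CHROMA_API_KEY%"') yields 'apiKey = "abc"'. Keeping the key in an environment variable avoids storing it in the data service definition itself.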

  

For more information see https://docs.trychroma.com/deployment/auth.

Test your settings by clicking on the toolbar image highlighted below.

  

The result should be

  

Save the Data Service by clicking on the image highlighted below.

  

The data service will be available the next time you log on. Next, see the Chroma-oriented query interaction documentation and any tutorials for information on interacting with Chroma Cloud from Qarbine.
